Summary


A relatively large body of research indicates that people are conservative
processors of probabilistic information. Recent attention has focused on
two possible explanations of this phenomenon. The misaggregation hypothesis
depicts conservatism as an inability to properly combine the information in
sequences of data. The other explanation suggests conservatism is the
result of a response bias: the avoidance of extreme odds or probability
judgments.

Two experiments explored the use of a specific response, average
certainty, that was devised to thwart conservatism caused by either a
response bias or a certain form of misaggregation. Use of appropriate
instructions and response scales made the average certainty judgments good
subjective assessments of the arithmetic mean log likelihood ratio, which
could then be used in the appropriate form of Bayes' Theorem to calculate
posterior odds. These judgments seemed unlikely to be affected by a response
bias since extreme responses were not needed. In addition, research has
suggested that people are more likely to aggregate information by averaging
than by adding or multiplying, so misaggregation may be exhibited only
in specific forms of aggregation and may not be present in averaging.

The results of Experiment I indicated that average certainty judgments
were both more orderly and more veridical than cumulative certainty
judgments of the type usually obtained in probabilistic inference tasks. The
cumulative judgments were very conservative while the average certainty
judgments were only slightly radical. Experiment II indicated that
average certainty judgments and individual likelihood ratio judgments were
both more orderly and veridical than cumulative certainty judgments but
that they did not differ significantly from each other in either
orderliness or veridicality. A second factor, the diagnosticity level of the
data, was also found to influence the veridicality of obtained judgments.
Regardless of the method of aggregation employed, estimates became more
veridical as the data became more diagnostic. Since these studies were
undertaken only to see if average certainty judgments are an effective
way to reduce conservatism, they do not directly test what causes
conservatism. However, some implications concerning the nature of conservatism
are discussed, as are the implications for the technology of probabilistic

Disclaimer


The  views  and  conclusions  contained  in  this  document  are  those  of  the 
authors  and  should  not  be  interpreted  as  necessarily  representing  the 
official policies, either expressed or implied, of the Advanced Research
Projects  Agency  or  of  the  United  States  Government. 


I. Introduction


A relatively  large  body  of  literature  (see  reviews  by  DuCharme,  1969; 
Slovic and Lichtenstein, 1971) supports the assertion that people are
conservative processors of probabilistic information. When presented with
data,  individuals  typically  revise  their  opinion  to  an  extent  less  than 
prescribed  by  Bayes'  Theorem,  the  formally  appropriate  model.  In  its 
simplest  form  with  two  hypotheses  and  conditionally  independent  data, 
Bayes'  Theorem  takes  the  form: 


$$\Omega_n = \Omega_0 \prod_{i=1}^{n} L_i \tag{1}$$

where $\Omega_0$ is the prior odds, $L_i$ is the likelihood ratio for the ith
datum, and $\Omega_n$ is the revised (posterior) odds.
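
As a minimal sketch of equation (1), assuming conditionally independent data,
the posterior odds can be obtained by multiplying the prior odds by each
likelihood ratio in turn. The function name `posterior_odds` and the numbers
used are illustrative assumptions, not values from the experiments.

```python
from math import prod


def posterior_odds(prior_odds, likelihood_ratios):
    """Equation (1): prior odds times the product of the likelihood ratios."""
    return prior_odds * prod(likelihood_ratios)


# Illustrative likelihood ratios only (not data from the experiments).
example_ratios = [2.0, 4.0, 0.5, 3.0]
print(posterior_odds(1.0, example_ratios))  # 12.0
```

Note that the result (12) lies outside the range of the individual likelihood
ratios, which is the property that makes veridical cumulative responses extreme.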

Three hypotheses have been advanced to explain conservative human
inference (Edwards, 1968). The misaggregation hypothesis depicts
conservatism as an inability to properly combine the information present in
sequences of data, although each single datum is judged accurately. A
second hypothesis, the misperception hypothesis, argues that people
aggregate information properly (that is, according to Bayes' Theorem), but
incorrectly diagnose the information in a single datum. The final
explanation considers conservatism the result of a response bias: the avoidance
of extreme odds or probability judgments.

Recent attention has focused primarily on the misaggregation and
response bias hypotheses, since Wheeler and Edwards (1975) demonstrated
rather convincingly that misperception played little, if any, role in
producing conservatism. DuCharme (1970) provides the main support for the
response bias explanation. He found aggregated posterior odds judgments
to be near veridical for non-extreme odds, while unaggregated likelihood
ratio judgments were conservative when the true likelihood ratio was
relatively extreme. Wheeler and Edwards refute some of DuCharme's findings,
providing persuasive support for the misaggregation hypothesis. Still
they conclude that

"in all likelihood misperception, misaggregation, and
response biases all contribute to conservatism. The real
questions of importance then becomes finding the manner
in which each phenomenon contributes to conservatism and
the best way of avoiding or compensating for this
non-optimal behavior." (p. 10)

One  strategy  that  could  lead  to  reduction  or  elimination  of  the  bias 
against  extreme  responses  is  to  remove  the  need  for  extreme  responses:  that 
is,  to  find  some  type  of  judgment  that  conveys  the  necessary  information 
but  for  which  the  numerical  response  is  not  extreme.  Consideration  of 
these  requirements  for  posterior  odds  judgments  suggests  some  sort  of 
average  judgment.  The  responses  required  for  average  judgments  have  the 
advantage that the aggregated judgment falls within the range of the inputs
to the judgment, as opposed to the usual posterior odds judgments
in  which  the  veridical  response  will  involve  multiplication  of  likelihood 
ratios,  typically  making  the  response  much  larger  than  the  inputs. 

Reviewing alternative forms of Bayes' Theorem (equation 1) suggests
two possible types of judgments that convey all the information necessary
and that are also averages of some sort. The geometric mean of the
likelihood ratios is one possibility. However, geometric means seem to be
a difficult judgment for people to make. In addition, if subjects
substituted arithmetic means for the geometric means (a seemingly likely
occurrence), the results would be biased away from conservatism, since the
arithmetic mean is never smaller than the geometric mean and is strictly
larger whenever the likelihood ratios differ.
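
The size of this bias can be illustrated with a short sketch. It assumes 1:1
prior odds and uses made-up likelihood ratios; the helper names are
hypothetical. Raising the geometric mean to the nth power recovers the
veridical posterior odds, while the arithmetic mean overshoots.

```python
from math import prod


def geometric_mean(values):
    """nth root of the product of the values."""
    return prod(values) ** (1.0 / len(values))


def arithmetic_mean(values):
    """Ordinary average of the values."""
    return sum(values) / len(values)


# Illustrative likelihood ratios only (not data from the experiments).
ratios = [2.0, 4.0, 0.5, 3.0]
n = len(ratios)

gm = geometric_mean(ratios)
am = arithmetic_mean(ratios)

# With 1:1 prior odds, posterior odds are the mean ratio raised to the nth power.
odds_from_gm = gm ** n  # equals the product of the ratios, up to rounding
odds_from_am = am ** n  # larger, i.e. biased toward more extreme odds

print(gm, am)
print(odds_from_gm, odds_from_am)
```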

Fortunately,  a second  possibility  exists.  Taking  base  ten  logarithms 
in  equation  (1)  yields 


$$\log \Omega_n = \log \Omega_0 + \sum_{i=1}^{n} \log L_i \tag{2}$$

The judgment that would play a role in equation (2) equivalent to that
played by the geometric mean likelihood ratio is the arithmetic mean log
likelihood ratio. If subjects assess arithmetic mean log likelihood
ratios, AMLL's, the posterior odds will be given by

$$\log \Omega_n = \log \Omega_0 + n \cdot \mathrm{AMLL} \tag{3}$$
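
A minimal sketch of this relationship follows, assuming base ten logarithms
as in equation (2). The function names and the example likelihood ratios are
illustrative assumptions. Multiplying the AMLL by n and adding the log prior
odds gives the same posterior odds as multiplying the likelihood ratios
directly.

```python
from math import log10


def amll(likelihood_ratios):
    """Arithmetic mean of the base-10 log likelihood ratios."""
    logs = [log10(ratio) for ratio in likelihood_ratios]
    return sum(logs) / len(logs)


def posterior_odds_from_amll(prior_odds, amll_value, n):
    """Posterior odds implied by an AMLL judgment based on n data."""
    return 10 ** (log10(prior_odds) + n * amll_value)


# Illustrative likelihood ratios only (not data from the experiments).
ratios = [2.0, 4.0, 0.5, 3.0]
mean_log_ratio = amll(ratios)
print(mean_log_ratio)
print(posterior_odds_from_amll(1.0, mean_log_ratio, len(ratios)))  # close to 12
```

The AMLL itself stays within the range of the individual log likelihood
ratios, so the judged quantity never has to be extreme.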

A review of descriptive studies exploring how people process information
lends additional support to the use of the arithmetic mean log likelihood
ratio as a normative procedure for processing probabilistic information.
People have been shown to use averaging rather than adding or multiplying
as a method for combining information in a wide variety of contexts, for
example in determining the overall value of products (Troutman and
Shanteau, 1976), in deciding how well they would like a person described
by personality trait adjectives (Andersen, 1965), and in predicting a
criterion number on the basis of two cue numbers (Lichtenstein, Earle,
and Slovic, 1975).

Probably the most relevant descriptive studies are those of Shanteau
(1975) and Troutman and Shanteau (1977) in which they used a task very
similar to that employed in conservatism experiments. In both studies, an
averaging model provided a much better fit to the data than did a
multiplying model, which is the appropriate model if subjects actually process
information in a manner consistent with Bayes' Theorem.

Certainly  these  descriptive  results  should  be  considered  in  designing 
normative  procedures.  If  indeed,  as  seems  likely,  people  do  tend  to 
average  information,  then  a normative  procedure  taking  advantage  of  this 
tendency may produce better results than one which requires an alternative
form  of  processing.  Perhaps  conservatism  results  from  the  specific  manner 
in  which  subjects  are  required  to  aggregate  information.  That  is,  people 
may  not  be  accustomed  to  using  addition  or  multiplication  as  aggregation 
procedures  for  processing  information. 

The present experiments explore the possibility of using subjective
judgments of the arithmetic mean log likelihood ratio to reduce
conservatism in aggregated judgments of certainty. These aggregated judgments
are also compared against unaggregated likelihood ratio judgments.

II.  Experiment  I 


II.1. Method

II.1.1. Apparatus. The stimuli for Experiment I were 6.5 inch (16.51
cm) pick-up sticks painted yellow on one end and blue on the other. The
length of yellow (or, because of symmetry, the length of blue) was the
random variable. The sticks shown subjects were hypothetically drawn from
two normally distributed populations differing only in mean length of
yellow. One population (the predominantly yellow) had a mean of 4.05
inches (10.287 cm) of yellow and the other (the predominantly blue) a mean
of 2.45 inches (6.223 cm). The populations shared a common standard
deviation of 1.0 inches (2.54 cm).
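
The veridical likelihood ratio for a single stick follows from these two
normal populations. The report does not write out the formula; the sketch
below uses the standard result for two normal densities with a common
standard deviation, and the function name is a hypothetical one chosen for
illustration.

```python
from math import exp

MEAN_YELLOW = 4.05  # inches of yellow, predominantly yellow population
MEAN_BLUE = 2.45    # inches of yellow, predominantly blue population
SD = 1.0            # common standard deviation, inches


def likelihood_ratio(yellow_length):
    """Likelihood ratio (yellow population : blue population) for one stick.

    For two normal densities with equal variance the ratio reduces to
    exp((m1 - m2) * (x - midpoint) / sd**2).
    """
    midpoint = (MEAN_YELLOW + MEAN_BLUE) / 2
    return exp((MEAN_YELLOW - MEAN_BLUE) * (yellow_length - midpoint) / SD ** 2)


print(likelihood_ratio(3.25))  # 1.0 (stick exactly at the midpoint)
print(likelihood_ratio(4.25))  # one inch above the midpoint, favors yellow
print(likelihood_ratio(2.25))  # one inch below the midpoint, favors blue
```

A stick at the midpoint of 3.25 inches is uninformative, and sticks equally
far above and below the midpoint give reciprocal likelihood ratios.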

Each  stick  was  displayed  on  a white  rectangular  card.  Eight  sequences 
of  four  sticks  were  randomly  selected  for  use  as  stimuli.  Half  the  sequences 
were  drawn  from  the  predominantly  blue  population  while  the  other  half  were 
drawn from the predominantly yellow population, so prior odds were always
1:1. 

The  population  characteristics  were  displayed  to  the  subjects  by  means 
of  two  random  histograms,  each  representing  a sample  of  99  sticks  from  one 
of  the  two  populations.  The  lengths  displayed  had  been  carefully  chosen 
to  accurately  represent  the  populations.  The  displays  were  actual  size 
and  colors,  and,  on  each,  the  population  mean  was  displayed  by  a heavy 
black  horizontal  line  at  the  appropriate  position.  Both  displays  were 
present  throughout  the  experiment.  All  responses  were  made  on  sheets  of 
paper which contained space at the top for the subject to check the more
likely  population  along  with  four  logarithmically  spaced  scales  ranging 
from  1:1  odds  (designated  the  "uncertainty"  point)  to  odds  of  10,000:1 
and  1:10,000. 

II.1.2. Subjects. Twenty naive subjects, run in groups of three or
four, were drawn from an introductory undergraduate psychology course
taught at the University of Southern California.

II.1.3. Procedure. Subjects were randomly assigned to one of two
groups: one group made average certainty judgments, while the other judged
cumulative certainty. For the group judging average certainty, the
subjects were presented with the first stick and told to indicate, at the top
of the scale, their choice of the more likely population, after which they
were to designate, on the first scale column, the odds corresponding to
their subjective certainty. As the second stick was placed alongside the
first, subjects were told that their responses should represent a judgment
as to the "average of, rather than the total of, their certainty". Each
subject was told to use, as a reference in responding, the position of
the point on the first scale column. Detailed examples emphasizing the
process that should take place in arriving at an average certainty
judgment were provided. Subjects were told that this procedure would continue
with the addition of the third and fourth sticks, after which a new response
sheet would be provided for a new sequence of four sticks. Each subject
was also told that the same process was employed, but that direction
changed toward the bottom half of the scale in the event that the subject
changed his or her belief about which population was more likely to have
produced the sticks in view.

The instructions given to each subject in the group judging cumulative
certainty differed from those described above in one important way. As the
second stick was placed alongside the first, the subject was asked to
represent his or her cumulative certainty on the second scale column.
This cumulative certainty was to be judged relative to the certainty after
seeing the previous stick and was to be represented by moving distance
(away from the "uncertainty" line) on the second scale column. Examples
of this procedure were provided in the instructions.

Having read through one set of instructions with the subjects, the
experimenter provided four sequences of three sticks each as examples.
Subjects responded to the example stimuli and general feedback was
provided. Then eight sequences of four sticks were presented and 32 point
placements were made on eight sets of four scale columns.

II.2. Results and Discussion

The data were first subjected to a logarithmic transformation and the
average certainty judgment responses in logarithmic form were each
multiplied by n, the number of sticks on which the judgment was based. All
analyses were performed on the log transformed responses and the dependent
variable was the log posterior odds that were inferred from the subjects'
responses to the sequences of sticks.
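
The transformation just described, together with a per-subject regression of
inferred on veridical log posterior odds, can be sketched as follows. The
responses are hypothetical, constructed so that the subject is uniformly
somewhat radical; the helper names are assumptions and the least-squares
formulas are the textbook ones.

```python
from math import log10, sqrt


def inferred_log_posterior_odds(average_odds_response, n):
    """Log posterior odds implied by an average certainty response after n sticks."""
    return n * log10(average_odds_response)


def slope_and_correlation(x, y):
    """Least-squares slope of y on x and the Pearson correlation."""
    count = len(x)
    mean_x = sum(x) / count
    mean_y = sum(y) / count
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    return sxy / sxx, sxy / sqrt(sxx * syy)


# Hypothetical veridical log posterior odds after sticks 1 to 4 of one sequence.
veridical = [0.4, 0.9, 1.1, 1.8]
# Hypothetical average-odds responses from a subject who overshoots by 20 percent.
responses = [10 ** (1.2 * v / n) for n, v in enumerate(veridical, start=1)]

inferred = [
    inferred_log_posterior_odds(response, n)
    for n, response in enumerate(responses, start=1)
]
slope, r = slope_and_correlation(veridical, inferred)
print(slope, r)
```

A slope above 1 corresponds to radical judgments, a slope below 1 to
conservative ones, and the correlation indexes orderliness.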

For each subject, a regression analysis of inferred log posterior odds
on veridical log posterior odds was performed. The correlation coefficients
and slopes from these analyses are presented in Table 1. Judgments in the
average certainty condition were significantly more orderly (t(18) = 3.975,
p < .001 on Fisher-z transformed correlation coefficients) as reflected in
the mean correlation coefficients of .896 and .689 for the average and
cumulative certainty response groups respectively. Posterior odds inferred
from the averaging condition tended to be slightly radical (mean slope
= 1.264), while the posterior odds obtained from the cumulative condition
were extremely conservative (mean slope = 0.283). This difference was
significant, t(18) = 6.115, p < .001, in the predicted direction. Although
average certainty judgments were slightly radical, they were much closer
to veridical than the cumulative certainty judgments.
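
The two group comparisons can be approximately recomputed from the
per-subject values in Table 1. The sketch below assumes a pooled-variance
two-sample t test, which is consistent with the reported 18 degrees of
freedom but is not stated in the report; the values are transcribed from the
table and the function name is hypothetical.

```python
from math import atanh, sqrt
from statistics import mean, variance

# Per-subject values transcribed from Table 1.
average_r = [.882, .934, .922, .808, .925, .934, .871, .909, .882, .896]
average_b = [1.068, 1.543, 0.951, 0.900, 0.501, 0.898, 1.899, 1.463, 2.110, 1.295]
cumulative_r = [.598, .809, .898, .864, .748, .415, .479, .601, .571, .906]
cumulative_b = [0.164, 0.371, 0.509, 0.356, 0.329, 0.330, 0.108, 0.147, 0.155, 0.358]


def pooled_t(group_a, group_b):
    """Two-sample t statistic with pooled variance (an assumed test form)."""
    na, nb = len(group_a), len(group_b)
    pooled_variance = (
        (na - 1) * variance(group_a) + (nb - 1) * variance(group_b)
    ) / (na + nb - 2)
    standard_error = sqrt(pooled_variance * (1 / na + 1 / nb))
    return (mean(group_a) - mean(group_b)) / standard_error


# Fisher z transformation of a correlation is the inverse hyperbolic tangent.
t_correlations = pooled_t(
    [atanh(r) for r in average_r], [atanh(r) for r in cumulative_r]
)
t_slopes = pooled_t(average_b, cumulative_b)
print(t_correlations, t_slopes)
```

On these transcribed values the correlation test comes out very close to the
reported 3.975, and the slope test comes out near 6.0, slightly below the
reported 6.115. The small gap may reflect rounding in the table or a
difference in how the original test was computed.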
TABLE 1

Correlations and Slopes for Inferred Posterior Odds

                     Average Certainty     Cumulative Certainty
                        r        b             r        b
                      .882    1.068          .598    0.164
                      .934    1.543          .809    0.371
                      .922    0.951          .898    0.509
                      .808    0.900          .864    0.356
                      .925    0.501          .748    0.329
                      .934    0.898          .415    0.330
                      .871    1.899          .479    0.108
                      .909    1.463          .601    0.147
                      .882    2.110          .571    0.155
                      .896    1.295          .906    0.358

mean                  .896    1.264          .689    0.283
standard deviation    .038    0.496          .179    0.131

The  results  of  this  study  were  encouraging  for  the  use  of  average 
certainty  judgments  as  a means  of  assessing  subjective  certainty.  Subjects 
apparently  can  make  this  type  of  judgment  and,  at  least  under  the  conditions 
of  this  experiment,  make  such  judgments  quite  well.  Since  the  feasibility 
of  this  type  of  judgment  seems  to  have  been  established,  further  questions 
need  to  be  pursued.  Specifically,  do  these  results  generalize  to  other 
levels  of  data  diagnosticity?  And  how  does  the  veridicality  of  average 
certainty judgments compare with that of non-aggregated likelihood ratio
judgments?  These  questions  are  addressed  in  Experiment  II. 

III.  Experiment  II 

III.1. Method

The method used in Experiment II was the same as that used in Experiment
I with two exceptions: three levels of data diagnosticity were used, and
a third type of uncertainty judgment, single-datum likelihood ratios, was
also examined. The two pairs of normal distributions used in addition to
the original pair had mean yellow lengths of 4.35 inches (11.049 cm) and
3.75 inches (9.525 cm) for the predominantly yellow populations, and
2.15 inches (5.461 cm) and 2.75 inches (6.985 cm) for the predominantly
blue populations. Thus, the three levels of data diagnosticity,
$d' = (m_1 - m_2)/\sigma$, were 2.2, 1.6, and 1.0.
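
The three diagnosticity levels can be checked directly from the population
means, as in the sketch below. It also shows, under the standard
equal-variance normal model (not written out in the report), how the
veridical log likelihood ratio of one and the same stick grows with d'. The
names are hypothetical.

```python
from math import e, log10

SD = 1.0  # common standard deviation in inches

# (mean yellow length of yellow population, of blue population), in inches,
# keyed by the diagnosticity level stated in the text.
POPULATION_PAIRS = {
    2.2: (4.35, 2.15),
    1.6: (4.05, 2.45),
    1.0: (3.75, 2.75),
}


def d_prime(mean_yellow, mean_blue, sd=SD):
    """Diagnosticity: separation of the population means in SD units."""
    return (mean_yellow - mean_blue) / sd


def log10_likelihood_ratio(yellow_length, mean_yellow, mean_blue, sd=SD):
    """Base-10 log likelihood ratio for one stick under the normal model."""
    midpoint = (mean_yellow + mean_blue) / 2
    z = (yellow_length - midpoint) / sd
    return d_prime(mean_yellow, mean_blue, sd) * z * log10(e)


for level, (mean_yellow, mean_blue) in POPULATION_PAIRS.items():
    print(level, d_prime(mean_yellow, mean_blue),
          log10_likelihood_ratio(4.25, mean_yellow, mean_blue))
```

All three pairs share the midpoint 3.25 inches, so a given stick always
favors the same population and only the strength of the evidence changes.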

A three  by  three  factorial  design  was  created  by  crossing  the  three 
types  of  uncertainty  judgments  with  the  three  levels  of  data  diagnosticity. 
Ninety  subjects,  ten  per  cell,  were  randomly  assigned  to  the  experimental 
conditions and were run in groups of four or five. A few subjects were
run individually to obtain equal cell sizes. The subjects were from an
introductory  psychology  course  at  the  University  of  Southern  California  in 
which  participation  in  several  experiments  was  required. 

III.2. Results

A regression analysis was performed on the individual data of each
subject. For subjects making average or cumulative certainty judgments,
inferred log posterior odds were regressed on veridical log posterior odds,
while for subjects making likelihood ratio judgments, subjective likelihood
ratios were regressed on veridical likelihood ratios. The correlation
coefficients and slopes from these analyses are presented in Table 2.

Results of analyses of variance performed on the slopes and Fisher-z
transformed correlation coefficients are shown in Tables 3 and 4
respectively. Inspection of the cell means of the slopes, plotted in Figure 1(a),
indicated little difference between the likelihood ratio judgments (mean
= 1.359) and the average certainty judgments (mean = 1.366), both being
slightly radical. The cumulative certainty judgments, however, were again
extremely conservative (mean = .365). Slopes generally tended to decrease
as d' increased.
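
The marginal means quoted above can be recovered, to rounding, from the cell
means reported in Table 2. The sketch below simply averages those cell means
(equal cell sizes of ten are stated in the Method); the dictionary layout and
variable names are illustrative.

```python
# Cell mean slopes transcribed from Table 2, keyed by judgment type and d'.
CELL_MEAN_SLOPES = {
    "average certainty": {1.0: 1.608, 1.6: 1.541, 2.2: 0.952},
    "cumulative certainty": {1.0: 0.495, 1.6: 0.242, 2.2: 0.357},
    "likelihood ratio": {1.0: 1.767, 1.6: 1.136, 2.2: 1.176},
}

# Marginal mean slope for each judgment type (averaged over d').
by_method = {
    method: sum(cells.values()) / len(cells)
    for method, cells in CELL_MEAN_SLOPES.items()
}

# Marginal mean slope for each diagnosticity level (averaged over methods).
levels = [1.0, 1.6, 2.2]
by_level = {
    level: sum(cells[level] for cells in CELL_MEAN_SLOPES.values())
    / len(CELL_MEAN_SLOPES)
    for level in levels
}

print(by_method)
print(by_level)
```

The marginal means by level fall from d' = 1.0 to d' = 2.2, in line with the
statement that slopes tended to decrease as d' increased.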

The  significant  interaction  was  unexpected,  but  may  be  due  to  a problem 
on  the  first  day  of  experimentation.  Elimination  of  the  five  subjects  run 
the first day (average certainty judgments at d' = 1.6) eliminated the
interaction  as  illustrated  in  Figure  1(b).  These  five  subjects  all  had 
higher  slopes  than  subsequent  subjects  run  in  this  condition.  Perhaps 
these  subjects  did  not  correctly  understand  the  nature  of  the  judgments 
they  were  asked  to  make. 
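
The effect of dropping these subjects on the cell mean can be illustrated
from Table 2. The sketch assumes that the first five rows listed for average
certainty at d' = 1.6 are the five first-day subjects. The report does not
say this, but those rows are the five highest slopes in the cell, which
matches the description above.

```python
from statistics import mean

# Average certainty slopes at d' = 1.6, transcribed from Table 2 in listed order.
slopes = [2.583, 2.606, 1.788, 1.580, 2.051, 0.965, 0.667, 0.877, 0.948, 1.341]

# Assumption: the first five listed rows are the first-day subjects.
first_day = slopes[:5]
later_days = slopes[5:]

print(mean(slopes))      # cell mean with all ten subjects
print(mean(first_day))   # mean slope of the assumed first-day subjects
print(mean(later_days))  # cell mean after excluding them
```

Under that assumption the remaining subjects have a mean slope close to 1,
that is, close to veridical.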


TABLE 2

Correlations and Slopes for Average Certainty,
Cumulative Certainty, and Likelihood Ratio Judgments

d' = 1.0

                     Average Certainty   Cumulative Certainty   Individual Likelihood Ratio
                        r        b           r        b               r        b
                      .498    1.194        .893    0.834            .741    1.878
                      .351    1.449        .613    0.420            .831    1.800
                      .844    2.538        .852    0.565            .778    1.985
                      .566    1.443        .640    0.282            .900    2.585
                      .863    2.143        .913    0.726            .828    1.976
                      .240    0.461        .870    0.385            .753    1.188
                      .842    2.414        .838    0.518            .850    1.811
                      .819    1.560        .752    0.407            .464    1.038
                      .908    1.291        .729    0.507            .896    2.052
                      .761    1.586        .712    0.310            .763    1.352

mean                  .669    1.608        .781     .495            .780    1.767
standard deviation [illegible] .619        .107     .176            .125     .459

d' = 1.6

                     Average Certainty   Cumulative Certainty   Individual Likelihood Ratio
                        r        b           r        b               r        b
                      .822    2.583        .931    0.582            .572    0.623
                      .932    2.606        .576    0.055            .797    1.335
                      .928    1.788        .716    0.250            .855    1.110
                      .773    1.580        .749    0.237            .642    0.877
                      .912    2.051        .739    0.239            .811    1.780
                      .864    0.965        .575    0.139            .901    2.120
                      .939    0.667        .758    0.405            .904    1.703
                      .914    0.877        .393    0.099            .819    1.034
                      .860    0.948        .877    0.256            .751   -0.333
                      .848    1.341        .646    0.160            .875    1.110

mean                  .879    1.541        .696     .242            .793    1.136
standard deviation    .055     .705        .156     .155            .110     .687

d' = 2.2

                     Average Certainty   Cumulative Certainty   Individual Likelihood Ratio
                        r        b           r        b               r        b
                      .776    0.785        .813    0.587            .841    1.542
                      .830    0.643        .689    0.355            .901    1.156
                      .813    0.915        .838    0.464            .929    1.036
                      .896    0.964        .881    0.472            .927    1.328
                      .813    1.359        .641    0.267            .959    1.374
                      .889    1.107        .750    0.412            .938    1.335
                      .701    1.197        .564    0.210            .528    0.601
                      .950    0.869        .786    0.364            .799    1.033
                      .676    0.770        .608    0.113            .890    1.155
                      .847    0.914        .779    0.323            .806    1.195

mean                  .819     .952        .735     .357            .852    1.176
standard deviation    .085     .215        .105     .138            .127     .257
TABLE 3

ANOVA of Slopes

Source                         Sums of Squares    df    Mean Squares      p
Within Cells                        15.731        81       0.194
Aggregation Method                  19.931         2       9.996        0.001
Diagnosticity Level                  3.346         2       1.673        0.001
Aggregation x Diagnosticity          2.070         4       0.518        0.038

TABLE 4

ANOVA of Correlations

Source                         Sums of Squares    df    Mean Squares      p
Within Cells                         8.224        81       0.102
Aggregation Method                   0.746         2       0.373        0.030
Diagnosticity Level                  0.401         2       0.201        0.146
Aggregation x Diagnosticity          1.540         4       0.385        0.007

The average certainty and likelihood ratio judgments also appeared to
be  more  orderly  than  the  cumulative  certainty  judgments  as  measured  by  the 
correlations.  Data  diagnosticity  did  not  significantly  affect  the 
correlations. 
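
Tables 3 and 4 report mean squares and significance levels but not the F
ratios themselves. As a rough sketch, each F ratio can be formed by dividing
an effect's mean square by the within-cells mean square; the values below are
transcribed from the tables and the variable names are illustrative.

```python
# Mean squares transcribed from Table 3 (slopes) and Table 4 (correlations).
SLOPES = {
    "within": 0.194,
    "effects": {
        "Aggregation Method": 9.996,
        "Diagnosticity Level": 1.673,
        "Aggregation x Diagnosticity": 0.518,
    },
}
CORRELATIONS = {
    "within": 0.102,
    "effects": {
        "Aggregation Method": 0.373,
        "Diagnosticity Level": 0.201,
        "Aggregation x Diagnosticity": 0.385,
    },
}


def f_ratios(table):
    """F ratio for each effect: effect mean square over within-cells mean square."""
    return {
        name: mean_square / table["within"]
        for name, mean_square in table["effects"].items()
    }


f_slopes = f_ratios(SLOPES)
f_correlations = f_ratios(CORRELATIONS)
print(f_slopes)
print(f_correlations)
```

The aggregation method effect on slopes is by far the largest ratio, while
the diagnosticity effect on correlations is the smallest, matching the
pattern of significance levels in the tables.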

IV.  Discussion 

The  results  of  these  experiments  suggest  that  aggregated  judgments  of 


generally allowed subjective judgments to be veridical only if the
aggregation method was multiplication. In some cases, proper use of the response
scale could make the aggregation additive. For example, if the response
scale presents logarithmically spaced odds, the normative aggregation
method will be additive in distances on the scale.

However,  in  this  study  use  of  averaging  as  an  aggregation  method  was 
demonstrated  to  be  a viable  method  that  could  be  taught  to  subjects  and 
not  result  in  conservatism.  Specifically,  subjects  were  asked  to  make 
arithmetic  mean  log  likelihood  ratio  judgments,  although  of  course,  these 
precise  words  were  never  used.  The  posterior  odds  resulting  from  the 
judgments  were  very  near  veridical;  while  posterior  odds,  judged  in  the 
usual  way  by  asking  for  cumulative  certainty,  were  very  conservative. 

Either or both of two factors may account for the lack of conservatism
in the average judgments. The aggregated responses necessary for the
average judgments fall within the range of the single datum likelihood ratios
that are inputs to the aggregation, and therefore, are not extreme responses.
Thus, the bias against extreme responses may be reduced or eliminated. The
second factor is simply the aggregation process that subjects need to
employ to be veridical. In many situations people have been shown to use
averaging rather than some other method of aggregating multiple pieces of
information. In particular, the work by Shanteau (Shanteau, 1975; Troutman
and Shanteau, 1977) has shown that people average the information in
multiple samples of data on a task very similar to the one used in this study.
Since averaging appears to be the natural method of aggregation found in
these descriptive studies, it seems reasonable that normative information
processing based on averaging would outperform other methods of aggregation,
a hypothesis consistent with the results of this study.

These findings have implications for the further development of the
technology of inference. Because people typically have proved to be
conservative processors of information, researchers have looked for methods
of  obtaining  the  desired  information  from  people  without  the  elicited 
judgments  being  affected  by  conservatism.  Probabilistic  information 
processing  (PIP)  systems  were  developed  for  this  specific  purpose  (Edwards, 
Phillips,  Hays,  and  Goodman,  1968).  In  a PIP  system,  people  make  only 
likelihood  ratio  judgments,  a task  this  study  and  others  (Wheeler  and 
Edwards,  1975)  show  they  can  perform  quite  well.  Bayes'  Theorem  is  then 
used  to  combine  the  likelihood  ratio  judgments  in  the  proper  manner  to 
produce  posterior  odds. 

Since people are able to judge likelihood ratios quite accurately,
why even consider average certainty judgments as an alternative? This
study certainly did not show that average certainty judgments are superior
to likelihood ratio judgments. The reason for considering an alternative
to likelihood ratio judgments is that a problem may arise in applying PIP
systems in real world contexts. The people assessing the likelihood ratios
will typically have access to feedback about the posterior odds that are
calculated from their likelihood ratios. Goodman (1973), in a reanalysis
of data from five studies exploring methods of eliciting judgments about
uncertain events, concludes that feedback about the implications of
judgments makes them less extreme and is probably the most powerful variable
controlling the extremeness of the judgments. Thus, even a PIP system may
be susceptible to conservatism in real-world applications. This problem
seems less likely to characterize judgments of average certainty due to
the very nature of the elicited judgments. Should further research confirm
feedback-produced conservatism in PIP systems, average certainty judgments
may prove to be a useful alternative to PIP.